Compositions of AMP-18 isolated from mouse and pig antrum tissue stimulate
growth of confluent stomach, intestinal, and kidney epithelial cells in culture; human,
monkey, dog and rat cells are also shown to respond. This mitogenic (growth
stimulating) effect is inhibited by specific antisera (antibodies) to AMP-18, supporting
the conclusion that AMP-18, or its products, e.g. peptides derived from the protein by
isolation of segments of the protein or by synthesis, is a growth factor. Indeed, certain
synthetic peptides whose amino acid sequences represent a central region of the AMP-18
protein also have growth-factor activity. The peptides also speed wound repair in tissue
culture assays, indicating a stimulatory effect on cell migration, the process which
mediates restitution of stomach mucosal injury. Thus, the protein and its active peptides
are motogens. Unexpectedly, peptides derived from sub-domains of the parent molecule
can inhibit the mitogenic effect of bioactive synthetic peptides and of the intact, natural
protein present in stomach extracts.

There are 3 activities of the gastrokine proteins and peptides of the present
invention. The proteins are motogens because they stimulate cells to migrate. They are
mitogens because they stimulate cell division. They function as cytoprotective agents
because they maintain the integrity of the epithelium (as shown by the protection
conferred on electrically resistant epithelial cell layers in tissue culture treated with
damaging agents such as oxidants or non-steroidal anti-inflammatory drugs, NSAIDs).

The invention relates to a group of isolated homologous cellular growth stimulating
proteins designated gastrokines, that are produced by gastric epithelial cells and include
the amino acid sequence VKEK7QKKXXGKGPGGXPPPK. An
isolated protein of the group has an amino acid sequence as shown in FIG. 7. The protein
present in pig gastric epithelia in a processed form lacking the 20 amino acids which
constitute a signal peptide sequence, has 165 amino acids and an estimated molecular
weight of approximately 18 kD as measured by polyacrylamide gel electrophoresis. Signal
peptides are cleaved after passage through the endoplasmic reticulum (ER). The protein is
capable of being secreted. The amino acid sequence shown in FIG. 3 was deduced from
a human cDNA sequence. An embodiment of the protein is shown with an amino acid
sequence as in FIG. 6, a sequence predicted from mouse RNA and DNA.
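
The relationship between a pre-protein and its processed form can be illustrated with a short sketch. It assumes the human preAMP-18 sequence as it is printed in the FIG. 3 listing of this document and the 20-residue signal peptide stated above; the function and variable names are hypothetical and chosen only for illustration.

```python
# Human preAMP-18 as printed in the FIG. 3 listing of this document (185 residues).
PRE_AMP18_HUMAN = (
    "MKFTIVFAGLLGVFLAPALANYNIDVKDDNNNAGSGQQSVSVNNEHNVAN"
    "VDNNNGWDSWNSIWDYGNGFAATRLFQKKTCIVHKMKKEVMPSIQSLDAL"
    "VKEKKLQGKGPGGPPPKGLMYSVNPNKVDDLSKFGKNIANMCRGIPTYMA"
    "EEMQEASLFFYSGTCYTTSVLWIVDISFCGDTVEN"
)

# Length of the signal peptide as stated in the text (first 20 amino acids).
SIGNAL_PEPTIDE_LENGTH = 20


def split_signal_peptide(pre_protein, signal_length=SIGNAL_PEPTIDE_LENGTH):
    """Return (signal peptide, processed protein) for a pre-protein string."""
    return pre_protein[:signal_length], pre_protein[signal_length:]


signal, mature = split_signal_peptide(PRE_AMP18_HUMAN)
print("pre-protein length:", len(PRE_AMP18_HUMAN))
print("signal peptide:", signal)
print("processed length:", len(mature))
print("first processed residue:", mature[0], "at position", SIGNAL_PEPTIDE_LENGTH + 1)
```

Under these assumptions the processed human sequence begins at N-21, in agreement with the FIG. 3 description, and has the same length (165 residues) that the text reports for the processed pig protein.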
A growth stimulating (bioactive) peptide may be derived from a protein of the
gastrokine group. Bioactive peptides rather than proteins are preferred for use because
they are smaller; consequently, the cost of synthesizing them is lower than for an entire
protein.

In addition, a modified peptide may be produced by the following method:

(a) eliminating major protease sites in an unmodified peptide amino acid
sequence by amino acid substitution or deletion; and/or

(b) introducing into the modified peptide amino acid analogs of amino acids in the
unmodified peptide.

An aspect of the invention is a synthetic growth stimulating peptide, having a
sequence of amino acids from positions 78 to 119 as shown in FIG. 3.

Another peptide has a sequence of amino acids from position 97 to position 117
as shown in FIG. 3.

Another peptide has a sequence of amino acids from position 97 to position 121
as shown in FIG. 3.

Another peptide has a sequence of amino acids from position 104 to position 117
as shown in FIG. 3.

An embodiment of an isolated bioactive peptide has one of the following
sequences: LDTMVKEQK..GKGPGGAPPKDLMY or
KKLQGKGPGGPPPK. An embodiment of an inhibitor of a protein
of the gastrokine group has the amino acid sequence KKTCIVHKMKK
or KKEVMPSIQSLDALVKEKK (see also Table 1).

The invention also relates to a pharmaceutical composition including at least a
growth stimulating peptide.

A pharmaceutical composition for the treatment of diseases associated with
overgrowth of gastric epithelia includes an inhibitor of a protein of the group of
gastrokines or of a growth stimulating peptide derived from the gastrokine proteins.

A pharmaceutical composition for the treatment of diseases of the colon and small
intestine includes at least a growth stimulating peptide of the present invention.
Examples of such diseases include ulcerative colitis and Crohn's Disease.
(b) providing environmental conditions allowing migration of the epithelial
cells.

A method for cytoprotection of damaged epithelial cells in the gastrointestinal
tract of mammals includes the following steps:
(a) contacting the damaged epithelial cells with a composition including a
protein of the gastrokine group or a peptide derived from the protein; and
(b) providing environmental conditions allowing repair of the epithelial cells.
The damaged cells may form an ulcer.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a human genomic nucleotide sequence of a pre-gastrokine;
sequence features were determined from cDNA and PCR of human genomic
DNA amph-ge8.seq Length: 7995 predicted promoter: 1405; exon 1: 1436-1490; exon
2: 4292-4345; exon 3: 4434-4571; exon 4: 5668-5778; exon 5: 6709-6856; exon 6:
7525-7770; polyA site: 7751.

FIG. 2 is a human cDNA sequence; the DNA clone was
obtained by differential expression cloning from human gastric cDNA libraries.

FIG. 3 is a human preAMP-18 protein sequence predicted
from a cDNA clone based on Powell (1987) and revised by the present inventors; N-21
is the expected N-terminus of the mature protein.

FIG. 4 is a mouse preAMP-18 sequence determined from
RT-PCR of mRNA and PCR of BAC-clones of mouse genomic DNA sequences:
predicted promoter: 1874; experimental transcription start site: 1906; translation
initiation site: 1945; CDS 1: 1906-1956; CDS 2: 3532-3582; CDS 3: 3673-3813; CDS 4:
4595-4705; CDS 5: 5608-5749; CDS 6: 6445-6542; polyA site: 6636.

FIG. 5 is a mouse cDNA sequence for preAMP-18.

FIG. 6 is mouse preAMP-18 amino acid sequence; RT-PCR
performed on RNA isolated from mouse stomach antrum; Y-21 is the predicted
N-terminus of the mature protein; the spaces indicated by mean there are no nucleotides
there to align with other sequences in FIG. 11.

FIG. 7 is a [pig genomic DNA related to the cDNA] --cDNA--
expressing porcine AMP-18.
[FIG. 8 is the cDNA pig sequence of AMP-18. *Based on Powell (1987). D-21
is the N-terminus of the mature protein - confirmed by sequencing of the protein isolated
from pig stomach.]

FIG. [9] --8-- is pig pre-gastrokine (pre-AMP-18) protein sequence
predicted from cDNA clone based on Powell (1987); D-21 is the N-terminus
of the mature protein, confirmed by sequencing of the protein isolated from pig stomach.

FIG. [10] --9-- is a comparison between the amino acid sequences of human
versus pig pre-gastrokine.

FIG. [11] --10-- shows a computer-generated alignment comparison of human,
pig and mouse predicted
protein sequences determined from sequencing of cDNA clones for human and pig
AMP-18, and by polymerase chain reaction of mouse RNA and DNA using preAMP-18
specific oligonucleotide primers; in each case the first 20 amino acids constitute the
signal peptide, cleaved after passage through the endoplasmic reticulum membrane.

FIG. [12] --11-- shows the effect of porcine gastric antrum mucosal extract,
human AMP peptide 77-97, and EGF on growth of gastric epithelial cells; AGS cells
were grown in DMEM containing fetal bovine serum (5%) in 60-mm dishes; different
amounts of pig antrum extract, HPLC purified peptide 77-97, and/or EGF were added;
four days later the cells were dispersed and counted with a hemocytometer; antrum
extract and peptides each stimulated cell growth in a concentration-dependent manner;
the bar graph shows that at saturating doses, peptide 77-97 (8g/ml) or EGF (50ng/ml) was
mitogenic; together they were additive, suggesting that the two mitogens act using
different receptors and/or signaling pathways; anti-AMP antibodies inhibited the antrum
extract but did not inhibit peptide 77-97.

FIG. [13] --12-- shows the structure of the human and mouse preAMP-18 genes;
the number of base pairs in introns are shown above the bars; exons are indicated E1-E6
and introns I1-I5; there are minor differences in intron length.
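
The exon coordinates given for the human gene in the FIG. 1 description allow exon and intron lengths to be worked out directly. The sketch below is illustrative only: it assumes the six exon ranges listed for FIG. 1 are 1-based and inclusive, and the helper names are hypothetical.

```python
# Exon coordinates of the human pre-gastrokine gene as listed in the FIG. 1
# description (assumed to be 1-based, inclusive positions in the 7995-nt sequence).
HUMAN_EXONS = [
    (1436, 1490),
    (4292, 4345),
    (4434, 4571),
    (5668, 5778),
    (6709, 6856),
    (7525, 7770),
]


def exon_lengths(exons):
    """Length in base pairs of each exon."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Length in base pairs of each intron lying between consecutive exons."""
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


for number, length in enumerate(exon_lengths(HUMAN_EXONS), start=1):
    print(f"E{number}: {length} bp")
for number, length in enumerate(intron_lengths(HUMAN_EXONS), start=1):
    print(f"I{number}: {length} bp")
```

Six exons yield five introns, matching the E1-E6 and I1-I5 labels used in the gene-structure figure.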
BIOACTIVITY OF SYNTHETIC PEPTIDES BASED ON THE
SEQUENCE OF GASTROKINE (AMP-18)

Name of Peptide,
Sequence in Human*  #AA  AMINO ACID SEQUENCE                          K1/2, µM

78-119              42   KKTCIVHKMKKEVMPSIQSLDALVKEKKLQGKGPGGPPPKGL   0.3
78-88               11   KKTCIVHKMKK                                  Inactive
87-105              19   KKEVMPSIQSLDALVKEKK                          Inactive
104-117             14   KKLQGKGPGGPPPK                               0.8
104-121             18   KKLQGKGPGGPPPKGLMY                           1.0
97-117              21   LDALVKEKKLQGKGPGGPPPK (SEQ ID NO: 8)         0.3
97-117**            21   GKPLGQPGKVPKLDGKEPLAK (SEQ ID NO: 9)         Inactive
97-121              25   LDALVKEKKLQGKGPGGPPPKGLMY (SEQ ID NO: 10)    0.2
109-117             9    KGPGGPPPK (portion of SEQ ID NO: 10)         2.5
104-109             6    KKLQGK (portion of SEQ ID NO: 10)            7.4
110-113             4    GPGG (portion of SEQ ID NO: 10)              Inactive
mouse 97-119        23   LDTMVKEQK..GKGPGGAPPKDLMY (SEQ ID NO: 2)     0.2

Table 1: Analysis of mitogenic peptides derived from the human and mouse gastrokine
(AMP-18) sequence. A 14 amino acid mitogenic domain is in bold type. *Peptides are
identified by their position in the amino acid sequence of the pre-gastrokine (preAMP-18).
#AA: number of amino acids in a peptide. K1/2: concentration for half-maximal growth
stimulation.

Overlapping inactive peptides can inhibit the activity of the mitogenic peptides: that is, human
peptides 78-88 and 87-105 block the activity of peptide 78-119, and while peptide 87-105 blocks
the activity of peptide 104-117, the peptide 78-88 does not. Peptides 78-88 and 87-105
block the activity of the protein in stomach extracts.
**scrambled
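
Because the human peptides in Table 1 are named by their positions in the pre-gastrokine sequence, each one can be re-derived by slicing the FIG. 3 sequence. The following sketch is a consistency check only; it assumes the FIG. 3 sequence as printed in this document and the Table 1 sequences as they can best be read, and all names in it are hypothetical. The scrambled peptide and the mouse peptide are left out because they are not slices of the human sequence.

```python
# Human preAMP-18 as printed in the FIG. 3 listing of this document (185 residues).
PRE_AMP18_HUMAN = (
    "MKFTIVFAGLLGVFLAPALANYNIDVKDDNNNAGSGQQSVSVNNEHNVAN"
    "VDNNNGWDSWNSIWDYGNGFAATRLFQKKTCIVHKMKKEVMPSIQSLDAL"
    "VKEKKLQGKGPGGPPPKGLMYSVNPNKVDDLSKFGKNIANMCRGIPTYMA"
    "EEMQEASLFFYSGTCYTTSVLWIVDISFCGDTVEN"
)


def peptide(pre_protein, first, last):
    """Return residues first..last of pre_protein (1-based, inclusive)."""
    return pre_protein[first - 1:last]


# Human peptides of Table 1, keyed by (first, last) position.
TABLE_1_HUMAN = {
    (78, 119): "KKTCIVHKMKKEVMPSIQSLDALVKEKKLQGKGPGGPPPKGL",
    (78, 88): "KKTCIVHKMKK",
    (87, 105): "KKEVMPSIQSLDALVKEKK",
    (104, 117): "KKLQGKGPGGPPPK",
    (104, 121): "KKLQGKGPGGPPPKGLMY",
    (97, 117): "LDALVKEKKLQGKGPGGPPPK",
    (97, 121): "LDALVKEKKLQGKGPGGPPPKGLMY",
    (109, 117): "KGPGGPPPK",
    (104, 109): "KKLQGK",
    (110, 113): "GPGG",
}

for (first, last), listed in TABLE_1_HUMAN.items():
    derived = peptide(PRE_AMP18_HUMAN, first, last)
    status = "match" if derived == listed else "MISMATCH"
    print(f"{first}-{last}: {len(derived)} aa, {status}")
```

The same slicing also gives the number of residues in each peptide (last position minus first position plus one), which is how the #AA column can be checked.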
WE CLAIM:

1. A group of isolated homologous cellular growth stimulating proteins 
designated gastrokines, said proteins produced by gastric epithelial cells and comprising 
the amino acid sequence 

2. An isolated protein from the group of claim 1, said protein further
characterized as comprising an amino acid sequence as in FIG. 7, present in pig gastric
epithelia in a processed form lacking the 20 amino acids which constitute a signal peptide
sequence, having 165 amino acids and an estimated molecular weight of approximately
18 kD as measured by polyacrylamide gel electrophoresis, said protein capable of being
secreted.

3. A protein from the group of claim 1, further characterized as comprising
an amino acid sequence as in FIG. 3, said sequence deduced from a human cDNA.

4. A protein from the group of claim 1, further characterized as comprising
an amino acid sequence as in FIG. 6, said sequence predicted from mouse RNA and
DNA.

5. A growth stimulating peptide derived from a protein of claim 1.

6. A modified peptide produced by the method comprising the following
steps:

(a) eliminating major protease sites in an unmodified peptide amino
acid sequence by amino acid substitution or deletion in the
unmodified peptide derived from a protein of claim 1; and

(b) optionally introducing amino acid analogs of amino acids in the
unmodified peptide.

7. A synthetic growth stimulating peptide, having a sequence of amino acids
from positions 78 to 119 as shown in FIG. 3.

8. The synthetic growth stimulating peptide of claim 7, said peptide having
a sequence of amino acids from position 97 to position 117 as shown in FIG. 3.

9. The synthetic growth stimulating peptide of claim 7, said peptide having
a sequence of amino acids from position 97 to position 121 as shown in FIG. 3.

10. The synthetic growth stimulating peptide of claim 7, said peptide having
a sequence of amino acids from position 104 to position 117 as shown in FIG. 3.

11. An isolated bioactive peptide comprising a sequence selected from the
group consisting of LDTMVKEQK..GKGPGGAPPKDLMY and
KKLQGKGPGGPPPK.
12. An inhibitor of a protein of claim 1, said inhibitor selected from the group
of peptides having an amino acid sequence consisting of KKTCIVHKMKK
and KKEVMPSIQSLDALVKEKK.

13. A composition used for the treatment of ulcers, said composition
including at least a growth stimulating peptide of claim 5.

14. A pharmaceutical composition for the treatment of diseases associated
with overgrowth of gastric epithelia, said composition comprising an inhibitor of a
protein of the group of claim 1 or of a growth stimulating peptide of claim 5.

15. A pharmaceutical composition for the treatment of diseases of the colon
and small intestine, said diseases selected from the group consisting of ulcerative colitis
and Crohn's Disease, said composition comprising at least a growth stimulating peptide
of claim 5.

16. An antibody to a protein of the group of claim 1, said antibody
recognizing an epitope within a peptide of the protein that has an amino acid sequence
from position 78 to position 119 as in FIG. 3.

17. An isolated genomic DNA molecule with the nucleotide sequence of a
human as shown in FIG. 1.

18. An isolated cDNA molecule encoding a human protein, said protein
having the amino acid sequence as shown in FIG. 2.

19. An isolated DNA molecule comprising the genomic sequence found in
DNA derived from a mouse, said nucleotide sequence shown in FIG. 4.

20. A mouse with a targeted deletion in a nucleotide sequence in the mouse
genome that when expressed without the deletion encodes a protein of the group of claim
1.

21. A method of making a protein from the group of claim 1 or a peptide
derived from a protein of claim 1, said method comprising:

(a) obtaining an isolated cDNA molecule comprising a sequence
encoding the protein or peptide;

(b) placing the molecule in a recombinant DNA expression vector;

(c) transfecting a host cell with the recombinant DNA expression
vector;

(d) providing environmental conditions allowing the transfected host
cell to produce a protein encoded by the cDNA molecule; and

(e) purifying the protein from the host cell.
22. A method to stimulate growth of epithelial cells in the gastrointestinal
tract of mammals, said method comprising:

(a) contacting the epithelial cells with a composition comprising a
protein from the group of claim 1 or a peptide derived from a
protein of claim 1; and

(b) providing environmental conditions for stimulating growth of the
epithelial cells.

23. A method to inhibit cellular growth stimulating activity of a protein of the
group of claim 1, said method comprising:

(a) contacting the protein with an inhibitor; and

(b) providing environmental conditions suitable for cellular growth
stimulating activity of the protein.

24. The method of claim 23, wherein the inhibitor is an antibody directed
toward at least one epitope of the protein, said epitope comprising an amino acid
sequence from position 78 to position 119 of the deduced amino acid sequence in FIG.
3.

25. The method of claim 23, wherein the inhibitor is selected from the group
of inhibitor peptides consisting of KKTCIVHKMKK and
KKEVMPSIQSLDALVKEKK.

26. A method of testing the effects of different levels of expression of a
protein of claim 1, on mammalian gastrointestinal tract epithelia, said method
comprising:

(a) obtaining a mouse in accord with claim 20;

(b) determining the effects of a lack of the protein in the mouse;

(c) administering increasing levels of the protein to the mouse; and

(d) correlating changes in the gastrointestinal tract epithelia with the
levels of the protein in the epithelia.

27. A method to stimulate migration of epithelial cells after injury to the
gastrointestinal tract of mammals, said method comprising:

(a) contacting the epithelial cells with a composition comprising a
protein from the group of claim 1 or a peptide derived from a
protein of claim 1; and

(b) providing environmental conditions allowing migration of the
epithelial cells.
28. A method for cytoprotection of damaged epithelial cells in the
gastrointestinal tract of mammals, said method comprising: 

(a) contacting the damaged epithelial cells with a composition 
comprising a protein of the group of claim 1 or a peptide derived 
from a protein of claim 1; and 

(b) providing environmental conditions allowing repair of the 
epithelial cells. 

29. The method of claim 28, wherein the damaged cells are an ulcer. 




1 AGCTTTATAA CCATGTGATC CCATCTTATG GTTTCAATCC ATGCA^AGGA 
51 GGAAAATTGT GGGCACGAAG TTTCCAAAGG GAAAATTTAT AGATTGGTAG 
101 TTAATGAAAT ACAGTTTTCC TCCTTGGCAA ATTTAATTTA CTAGCTTCAC 
151 TGTATAGGAA AAAGCAGGAA AAAAATTAAA ACCAACTCAC CTCCAAACCT 
201 GTTTTGAGCT TTTACTTGTC TGCCCAATTG ATAGTTTCTA CTCTCTGCTT 
251 TTGATGAAAA TATTTTTTAT TATTTTAATG TAACTTCTGA AAACTAAATT 
301 ATCTAGAAGC AAATAAAAAG ATATTGCTTT TATAGTTCCC AGAAGGAAAA 
351 AACAAACACT AGGAAAGTTC TATCTATCAG ATGGGGGAGA TGTGATGGAG 
4 01 GCAGTGATAT TTGAGCTGAG CCTTGAACAA TGAACAGGAG TCTACCAAGC 
4 51 GAGAGGCTAG CGGGTGGCCC TCAAGATAAA ACAACAGCAT GTACAAAGGC 
501 ATGGAGACAT ACACATCTTG ACTCTTCCAG GAATGGTGGG AACGCTGGTG 
551 GAGCTAGAAT GTAGGTACAT AGCATAAAGT GGCAGACGGG AAGCCTTTGG 
601 AAATCTTATT ACATAGGACC CTGGATGCCA TTCCAATGAC TTTGAATTTT 
651 CTGTAGGCTG CCAGCGAAAf TTCCAAGCGT GATAGAGTCA TGTCTATCTA 
7 01 TGCACTTCAG AAAGACAACC TCAGGGTTAA TGAAGAAAAT GCATTGGAAT 
7 51 ATAAGAAACT GGTGACCAGA GTGATCAATT GCATGACTGT ' TGTGAAAGTC 
801 CAGGTGAGGG GAGCTGTGGG CAAGGTCAGA GTTGAGAGGC ATTTCAGAGA 
851 TAAAATGACA GTAACTAAGT AQATGTCAGG CTGAGAAGAA AGGGCTGTAC 
901 CAGATATATG GTGCTATCAT TAAGTGAGCT CAACATTGCA GAAAAGGGGT 
951 AGGTTTGGTG GGAGTTGCTC ACAAAACATG TTTAGTCTAA GCAAAACCAT 
1001 TGCCATGGGC TCAGATAAAA GTTAAGAAGT GGAAACCATT CCTACATTCC 
1051 TATAGGAGCT GCTATCTGGA AGGCCTAGTA TACACGTGGC TTTTCAGCTG 
1101 TGATTTTGTT TGATTl'TAGG GATTATTCTT TTTCTGAATC TGAGCAATGT 



FIG. 1 



1151 TAGCGTGTAA AATACTCACA CCCACAGCTT TGACTGGGTG AGAAGTTATC 
1201 ATAAATCATA TTGAGTTTGT TGTGATACCT TCAGCTTCAA CAAGTGATGA 
1251 GTCAGGTCAA CTCCATGTGA AAGTTCCTTG CTAAGCATGC AGATATTCTG 
1301 AAAGGTTTCC TGGTACACTG GCTCATGGCA CAGATAGGAG AAATTGAGGA 

13 51 AGGTAAGTCT TTGACCCCAC CTGATAACAC CTAGTTTGAG TCAACCTGGT 

14 01 TAAGTACAAA TATGAGAAGG CTTCTCATTC AGGTCCATGC TTGCCTACTC 
14 51 CTCTGTCCAC TGCTTTCGTG AAGACAAGAT GAAGTTCACA GTGAGTAGAT 
1501 TTTTCCTTTT GAATTTACCA CCAAATGATT GGAGACTGTC AATATTCTGA 

" 1551 GATTTAGGAG GTTTGCTTCT TATGGCCCCA TCATGGAAAG TTTGTTTTAA 
1601 AAAAATTCTC TCTTCAAACA CATGGACACA GAGAGGGGAA CAACACACAC 
1651 CAGGTCCTGT TGGGGGGTGG AGAGTGAGGG GAGGGAACTT AGAGGACAGG 
17 01 TCAATAGGGG CAGCAAACCA CCATGGCACA CATATACCTA TGTAACAAAC 
17 51 CTGCACGTTC TGCACATGTA TCCCTTTTTT TTAGAAGAAG AAATAATGAA 
1801 AAAAAACCTT TTTTCTATTT ATATAATCAT GGCATTTATA AGCATCTCTA 
1851 TAGAGAAGGA TAATTGTGCT GAGATTAGAC AGCTGTCTGA GCACCTCACA 
1901 CTGACCTATT TTTAACAAAA TGACTTTCCA CATCACCTGA TTTCGGCTCC 
1951 ATGCRGGGTA AGCAGTTCCT AAGCCCTAGA AAGTGCCGAT CATCCCTCAT 
2001 TCTTGAATTC CTCCTTTTAT TTACCAAAAT TCCTGAGCAT GTTCAGGAAA 
2051 GATGAAAAGC TTATTATCAA AATAAGTGGC TGAGATAGAC TTCTTGTCAC 
2101 ATTTGTTACA GTAAAATGGG TCTCCAAGAA AGAAAGATTT GCCTTGGGCT 
2151 CTAGCATGGC CATTTATTTA AGAAAGCATC TGAAACATGA AGCTACCACA 
2 201 GCATCTCTCC TGTGGTTCCA GACGGAAGCC TGAGAGTCTA GGAGGAGGTG 
2251 GACCGAGAAA CCCTGCCAAA GTAACTAGTA GTGCCGGGTT TCTCACAACA 
2301 CGATGCAAAG GGGCTAGAAT CAGATGACTA TTTTCATGTT T C AAC AT ACT 
2351 ACACACTGGA AAACGTTACG GCAGACTCTA CTTTATAATG GGGCTGCAAA
2 4 01 TGTAAAATGA CTAGTAGAAC TAGGTCCTCT TAATAGCAGC AAAGTTTAAA 
2 4 51 AGGGTCAGAG GGAGCTCCAG ACACAGGTTA GATTTGATTT CTCTCCTAGT 
2501 TCTGCTGTGA ACAAGAGGTA TAAGTTTGGC CAACTCACTT AACCCCTGAA 
2 551 GCTCAGTTAC CTTATCTGTA AAATGATTGC ATTGTACTAG GTGTTCTCTA 
2 601 AAATTTCTTC TACCTCTGAC TTTTTAGGAG ACTAATTTTT AACTCCTTTT 
2 651 TAAGCTATTG GGAGAAAAAT TTAATTTTTT TTCAAAAGTT ACCTTGAATC 
27 01 TCTAGAGCAG TTCTCAAAAC TATTTTGTCC CAGGCAAAGG AAATGAGACT 
27 51 AGGTACCCAG AATGAGGCAC CCTGCATAAA GCTCTGTGCT CTGAAAACCA 
2801 ATGTCAGGGA CCCTGTGATA AATAATTAAA CCAAGTATCC TGGGACACTG 
2851 CTAGTGACAT CGCCTCTGCT GATCACTCTT GCCAGCGAGA CACTCTATAC 
2901 TTGCTTTCTC ATCATTGGCA TCCAAACTGC CTACTAATCC ATTGCTTTGG 
2 951 AAAGTTTTTT TTAATAAAAA GATTATTTCT ATTAGGAGGA AAACATCCCA 
3001 TGTTAAATAG GAAAATTAAC TGAAATCATT TTCAGATGTG ATTTTTAGCA 
3051 CTTATAGCCA TTTCAAACCA TGGTATTCAT TTATACTATG CTATTTATTG 
3101 TAAAACTTCT TTTTTTTTCC AAGGAAAATA AGATAGTTTG CTTTATTTTA 
3151 AAACAGTAAC TTTCTTATAT TGGGGCACTG ACCAAAATTC AATACTGGTA 
3201 CAAATATGTT ACCTAGGGGG- TCAAAATATG TGCCAGGTGA ATTTTCTGAA 
32 51 TTTCTCTAAA GAGAGAATTT TAAACCTTAT AAAACAATTA GAAACAAGTG 
3301 AGTGAGAGGT GAGCATCAAC AACCTGTGTA ACATAAGCCA CAGTACAAAT 
3351 TTAAGCTGAA TAACCAAGCC ATGTCAGTTA TCCCAAATCA TTTTTGTTAA 
34 01 TATTTAGGAG GATACACATA TTTTCAATAA CTTAAAAGTG AATCTTTACT 
34 51 CCTATCTCTT AATACTCGAA GAAGTATAAC TTTCTTCTTT TACTAGATTT 
3501 AAATAATCCA AATATCTACT CAAGGTAGGA TGCTGTCATT AACTATAGCT 
3551 GAGTTTATCC AAAATAGAAA AATCATGAAG ATTTATAAAG CATTTTAAAA
3 601 ATAATCATTT ATAGCAAGTC CTTGAAAGCT CTAAATAAGA AAGGCAGTTC 
3 651 TCTACTTTCT AATAACACCT ATGGTTTATA TTACATAATA TAATTCAACA 
3701 AAACAGCATT CTGACCAATG ATAATTTATA GGAAATTCAT TTGCCAAGTA 
37 51 TATGTTTTAT TATAAAGTTA ATATTTTGAC CAATCTTAAA AATTTTTAAA 
3 801 CTCTATTCTG ACATTTCCAG AAGTATTATC TTAGCAAGTC ATCTTTATGA 
3 8 51 TACCACTTAT TAAACTGAAG AGAAACAAGA TGGTACATTC TGGGTTTTAC 
v 3 901 TTTAAAAGGG ATTTGATTCA ATAATTTGAT TTATCACTAC TTGAAAATTA 

3 9 51 CATTTTCTTC CTCAGACTGG ATGGCAATGA GATGAAAGCA GCTTTCCTGG 

4 001 CTCTCAACTT CCCTTCTTCA TCAATTTTTO CAGCGTTTCA TAAGGCCTAC 
4 051 ACTAAAAATT CTAAAACTAT ATATCACATT AATATAATTA CTTATAATTA 
4101 ATCAGCAATT TCACATTATC GTTAAAACCT TTATGGTTAA AAAATGCAAG 
4151 GTAAGAGAAG AAAAAAACAC ATTGAACTAG AACTGAACAC ATTGGTAAAA 
4 201 TTAGTGAATA CTTTTCATAA GCTTGGATAG AGGAAGAAAG AAGACATCAT 
4 2 51 TTTGCCATGT AACAGGAGAC CAATGTTATT TGTGATTTCA GATTGTCTTT 
4 301 GCTGGACTTC TTGGAGTCTT TCTAGCTCCT GCCCTAGCTA ACTATGTAAG 
4 351 TCTCACCTTT TCAAGTTTGC TACCAAAATG CATTTGCAAG GAAATGTGAT 
4 4 01 ATTAAATCAC TCTCAATCTC TTATAAACTT CAGAATATCA ACGTCAATGA 
4 4 51 TGACAACAAC AATGCTGGAA GTGGGCAGCA GTCAGTGAGT GTCAACAATG 
4 501 AACACAATGT GGCCAATGTT GACAATAACA ACGGATGGGA CTCCTGGAAT 
4 551 TCCATCTGGG ATTATGGAAA TGTAGGTAGT CAACGTGCAA TTTTCACTTT 
4 601 ATTGTTTAAA AATACGACTT CTTTTTAACA AAAAATGTGC ATGTTAACCA 
4 651 TAAAGAAATT AAAAATAAAT TCTAATTACA CAT AG CAT AC AGTTATAAGT 
4701 AAAGGTGACC ATTTTGCTCA TCCGATTTTG TTCCCTAGAG ATAACTACTG

4 7 51 TTAATAAGTG TTGCATGATC AGTTAAAATT CAAACCAACA AACACTATGT 

4 8 01 TCAAGGGATT GTGGGTATAT ACAACAAATA TGAACATCCT TTTGCCTTGC 

4 8 51 CTGCAGATAC CCTCAATAAT GCTGAAAGAC TTATACAACA TTACTGCTTC 

4 901 CAAAGCTTAG ACTATCTCAC TTTGTTTTCA AAGGAGGTTT TACGACCTTC 

4 951 TAAAGAGATT GAAATTGACA TTTCACCTAA AACTCGGGAA ATGTAAATGA 
5001 CAATATTAAT TGGTAAGAGA GGAAAGAAGA AAGAAAGAAG GAAGGAAAGA 
5051 AAGAAAGAAG GAAGGAAGGA AAGAAAGAAA GAAAGAAAGA AAGAGAGAGA 


5101 AAGAAAGAAA AAGAAAAAAG AGAGAAAGAG AGAAGGAAAG AAAGAGAGAA 
5151 GGAAAGGAAA AGAGAAGCAA AGAAAGAGAG GAGCAAAGAA AGGAACACTT 
5201 AGCACTAGTT GGGAGACCCA ACTCTGGAAT TATCAGCTAT ATATTTAACA 
52 51 AACGTTATAC TTTTAAATAG CAAACTCTTT ATTGTTTCAA TTTTATCTGG 
5301 TCAATTGGAA AAATAATTTT TGTCTTATCT GTCTCCTTGA AATGTGAGGA 
5351 TCAAAGGAGA CTAAAACATG ATAGCTTTTA AAGTCTATTT CAGTAAAACA 
5 4 01 GACTTATATA GAGGGGTTTT TATCATGCTG GAACCTGGAA ATAAAGCAAA 
54 51 CCAGTTAGAT GCTCAGTCTC TGCCCTCACA GAATTGCAGT CTGTCCCCAC 
5501 AAATGTCAGC AATAGATATG ATTGCCAAGC AGTGCCCCAT CCAGTGCTCT 
5551 TATCCCAGCT CATCACGATC T-TGGAGTTCC CATTTCTCTC TGCAGGTGGA 
5 601 ACTGACCTCT GATAAGAAAA GCTCCTCGGA G AAC AC AT G C CTCACTATTT 
5 651 GCCATCTACT TTAACAGGGC TTTGCTGCAA CCAGACTCTT TCAAAAGAAG 
57 01 ACATGCATTG TGCACAAAAT GAACAAGGAA GTCATGCCCT CCATTCAATC 
57 51 CCTTGATGCA CTGGTCAAGG AAAAGAAGGT AAAAATAAAA GGCTTTTTAT 
5B01 TTTTGGTGAG GGGAGAGGTT TTACATCCTT CAGTAAATAA CGAGAAGATC 
5851 ACAGTCATTC CCTCTTGACT ACAGTATGTT GTAGTGTGCA GCACAAAGGG 

FIG. 1 Cont. 



5 901 GGAAGTTATT GGTGATTGCC TGAGGGAAGG CAACTTCTGC CACATCAAAT 
59 51 GCTGTGGCTC ACACCTACCT CTACAACCGC TGAGCAAAGC ACTTGAAACC 
6001 TTGACTGTTA GAGGAGCAAA GCTCTGGTCA CACCAATAGG AGCCTCAGTA 
6051 CTTTGCCAAG GACATTTTTC TGCAAGAGTT AGTTAGGGTT ATTAGATTTA 
6101 GCAAATGAAA ATAGAAGATA TCCAGTTAGG TTTGAATTTT AGCjTAAGCAG 
6151 CAGGTCTTTT TAGTATAATA TATCCTATGC AATATTTGGG ATATACTAAA 
6201 AAAAGATCCA TTGTTATCTG AAATTCAAAT GTAACTGGGT ATT GT AT ATT 

62 51 TTGTCTGGCC ATACTAATCC AGGTGAGTGG AAAGAAGAGA TCCATAATGT 
V V 6301 TTTAAAATAT TTGCCTGAGT TCATATTCCT ATAACTGATA AATGAGTACC 

63 51 TTTCATTGAC AAGGTAGAGA AAATAAATAA ACTGCATTCT CAGAAGATGA 

64 01 TTATTACATA GTCTAATCCA AGGAATCTAT GATGACCAAA TGAGGTCCAA 
64 51 GTTGCAGAAT AAATTAAGCC TCAGACTTCT GTGTTTATGA GAAGCTGAGG 
6501 TTTCAAACCA GGTAAATCCC TTAGGACACT TAGAAATGCT AAGATATACA 
6551 GAATAAGCTA GAAATGGCTC TTCTTCATCT TGATTATGGA AAAATTTAGC 
6601 TGAGCAACAC TCACTGTTGG CCTCGTATAC CCCTCAAGTC AACAAACCAC 
6651 TGGGCTTGGC ATTCATTCTC TCCCATTCTT CCTTTCTACd TCTCTTTTCC 
67 01 ACACTCAGCT TCAGGGTAAG GGACCAGGAG GACCACCTCC CAAGGGCCTG 

67 51 ATGTACTCAG TCAACCCAAA CAAAGTCGAT GACCTGAGCA AGTTCGGAAA 
6801 AAACATTGCA AACATGTGTC GTGGGATTCC AACATACATG GCTGAGGAGA 

68 51 TGCAAGGTGA GTAGCATCCC TACTGTGCAC CCCAAGTTAG TGCTGGTGGG 
6901 ATTGTCAGAC TATCCTCGCG CGTGTCCATA GTGGGCACCA GTGATGCAGG 

69 51 GATGGTCATC AAGGCCAACA TTTGTGCAGT GCTTGCTCTG TGCCAGGTAC 
7 001 TGTTCTATGT GCTTTAAGTG TGTTAACTCG GTTCTTCACA GCAATCTTAT 
7 051 AGGTTCTATT TTAATCCTAC TTTATGGATG AGGAAACTGA GGTACAGAGA 
7101 GGTCACAAAA TCCTTGCCTG GGTCAATTCC AAGCATTTTG GCTGTGGATT
7151 CTGTGCTCTT AAATATTATG GAACACTGCC TTTTAAGTGT GAATCAAGAG 
7 201 TAGACTCAAG TCATATTCAA AAGAATGCAT GAATGGCTAA ATGAAAGAAG 
7 2 51 AATGCTAATA GAATCTATTA ACTTTCTATA GCTCAGACAA TCACTTAATT 
7301 TCTGGACATT CAAAGAACAG CTGCACACAA ACAAAGtGTC TACCTAGGGA 
7 351 CCTAACTTAA TGGCAATTTT CCAGATCTCT GAATTGATTG ATTTCATCAC 
7 4 01 AACAAGTAGA TAAACCTTGA CATTAGCACA TAGCTAGTTT GGAAACCCCT 
7 4 51 ACTCCCCCAA TCCCCTCCAA GAAAAGAGTC CTTAAATAGA CATTAATATA 
7501 GGCTTCTTCT TTTCTCTTTA TTAGAGGCAA GCCTGTTTTT TTACTCAGGA 
7 551 ACGTGCTACA CGACCAGTGT ACTATGGATT GTGGACATTT CCTTCTGTGG 
7 601 AGACACGGTG GAGAACTAAA CAATTTTTTA AAGCCACTAT GGATTTAGTC 
7651 . ATCTGAATAT GCTGTGCAGA AAAAATATGG GCTCCAGTGG TTTTTACCAT 
77 01 GTCATTCTGA AATTTTTCTC TACTAGTTAT . GTTTGATTTC TTTAAGTTTC 
77 51 AATAAAATCA TTTAGCATTG AATTCAGTGT ATACTCACAT TTCTTACAAT 
7 8 01 TTCTTATGAC TTGGAATGCA CAGGATCAAA AATGCAATGT GGTGGTGGCA 
7 8 51 AGTTGTTGAA GTGCATTAGA CTCAACTGCT AGCCTATAT* CAAGACCTGT 
7 901 CTCCTGTAAA GAACCCCTTC AGGTGCTTCA GACACCACTA ACCACAACCC 

7951 TGGGAATGGT TCCAATACTC TCCTACTCCT CTGTCCACTG CTTAA EQ IbKjO.'// 



FIG. 1 Cont. 




1 CATGCTTGCC TACTCCTCTG TCCACTGCTT TCGTGAAGAC AAGATGAAGT
51 TCACAATTGT CTTTGCTGGA CTTCTTGGAG TCTTTCTAGC TCCTGCCCTA
101 GCTAACTATA ATATCAACGT CAATGATGAC AACAACAATG CTGGAAGTGG
151 GCAGCAGTCA GTGAGTGTCA ACAATGAACA CAATGTGGCC AATGTTGACA
201 ATAACAACGG ATGGGACTCC TGGAATTCCA TCTGGGATTA TGGAAATGGC
251 TTTGCTGCAA CCAGACTCTT TCAAAAGAAG ACATGCATTG TGCACAAAAT
301 GAACAAGGAA GTCATGCCCT CCATTCAATC CCTTGATGCA CTGGTCAAGG
351 AAAAGAAGCT TCAGGGTAAG GGACCAGGAG GACCACCTCC CAAGGGCCTG
401 ATGTACTCAG TCAACCCAAA CAAAGTCGAT GACCTGAGCA AGTTCGGAAA
451 AAACATTGCA AACATGTGTC GTGGGATTCC AACATACATG GCTGAGGAGA
501 TGCAAGAGGC AAGCCTGTTT TTTTACTCAG GAACGTGCTA CACGACCAGT
551 GTACTATGGA TTGTGGACAT TTCCTTCTGT GGAGACACGG TGGAGAACTA
601 AACAATTTTT TAAAGCCACT ATGGATTTAG TCATCTGAAT ATGCTGTGCA
651 GAAAAAATAT GGGCTCCAGT GGTTTTTACC ATGTCATTCT GAAATTTTTC
701 TCTACTAGTT ATGTTTGATT TCTTTAAGTT TCAATAAAAT CATTTAGCAT
751 TG
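
How the start of this cDNA listing relates to the N-terminus of the predicted protein can be sketched with a short translation using the standard genetic code. This is an illustration under stated assumptions, not part of the disclosed method: the first 115 nucleotides are copied from the listing above, and the start position (nucleotide 44, the ATG of "AAGATGAAGT") is chosen here because it reproduces the MKF... N-terminus of the preAMP-18 sequence. All names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# First 115 nucleotides of the cDNA listing above, with spaces removed.
CDNA_5PRIME = (
    "CATGCTTGCCTACTCCTCTGTCCACTGCTTTCGTGAAGACAAGATGAAGT"
    "TCACAATTGTCTTTGCTGGACTTCTTGGAGTCTTTCTAGCTCCTGCCCTA"
    "GCTAACTATAATATC"
)

# Assumed start codon: 0-based index 43, i.e. nucleotide 44 of the listing.
START_INDEX = 43


def translate(dna, start, n_codons):
    """Translate n_codons consecutive codons of dna, beginning at index start."""
    codons = (dna[i:i + 3] for i in range(start, start + 3 * n_codons, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# 20 codons of signal peptide plus the first 4 residues of the processed protein.
print(translate(CDNA_5PRIME, START_INDEX, 24))
```

The start position is given explicitly rather than found by searching for the first ATG, because an ATG also occurs at nucleotide 2 of the listing.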




1 MKFTIVFAGLLGVFLAPALANYNIDVKDDNNNAGSGQQSVSVNNEHNVAN 50
51 VDNNNGWDSWNSIWDYGNGFAATRLFQKKTCIVHKMKKEVMPSIQSLDAL 100
101 VKEKKLQGKGPGGPPPKGLMYSVNPNKVDDLSKFGKNIANMCRGIPTYMA 150
151 EEMQEASLFFYSGTCYTTSVLWIVDISFCGDTVEN 185
1 GAATTCAAAC AGCAGGCCAT CTTTCACCAG CACTATCCGA ATCTAGCCAT
51 ACCAGCATTC TAGAAGAGAT GCAGGCAGTG AGCTAAGCAT CAGACCCCTG 
101 CAGCCCTGTA AGCTCCAGAC CATGGAGAAG AGGAAGGTTG TGGGTTCAAG 
151 GAGCTTTTCA GAGTGGAAAT CTGTGGATCA GTGATTTATA AAACACAGTT 
201 TCCCCCTTTA TTAGATTTGA ACCACCAGCT TCAGTTGTAG AAGAGAACAG 
251 GTTAAAAAAT AATAAGTGTC AGTCAGTTCT CCTTCAAAAC TATTTTAAAC 
301 GTTTACTTAT TTTGCCAAGT GACAGTCTCT GCTTCCTCTC CTAGjGAGAAG 
351 TCTTCCCTTA TTTTAATATA ATATTTGAAA GTTTTCATTA TCTAGAGCAG 
4 01 TGGTTCTCAT CCTGTGGQCC ATGAGCCCTT TGGGGGGGTT GAACGACCCT 
4 51 TTCACAGGGG TCACATATCA GATATCCTGC ATCTTAGCTA TTTACATTAT 
501 GATTCATAAC AGTAGCAAAA TTAGTTAGGA AG T AG G AAC A AAATAACGTT 
551 ATGGTTGTGG TCACCACTAT GTTAGAGGGT CCGCAGCATT CAGAGGGTTG 
601 AGAACTGTTG TTCTAGAGGC AAATAAGAAG ACAGAGTTCC TTGATAGGGC 
651 CCAGAGGCAG ^GAAAGAAGT TTCCACGTAG AAAGTGAAGA AGGTCTGGTG 
7 01 TCCGAAGCAG TGAGGAACTT AAAAAAAGAA AACCAAAAAC ATTGCCAACT 

7 51 AACAGTCCAG GAGAAGAGCG GGGCATGAAA GGCTGAGTTd CCATGGGATG 
801 CCTTGAATGG AATCAGAGTG TGGGAAAATT GGTGTGGCTG GAAGGCAGGT 

8 51 GCCGGGCATC TCAGACGCTG GTAGCTGGGG AAACAGGAAA CCCCTTTAGG 
901 ATCCCAAGAT GCCATTCCAA TGAGCTTGAG ATTTTTCTCA TGGACTGCCA 
951 GTGAATGTTT CTACGCTCCG GAAATTAATG TTTACTTATT TTCCATATTC 
1001 TAGGGGAGAA CCCTGGGAAA AATGGAGGAC ATTCATTGAA ATATCTGAGT 
1051 CCTGGGATAA GGCAGGCTTG GTCCTACAAC TCTGGTAAAA GTCCATCAGG 
1101 AAGTGCCTTG ACCAAGGCTG GAGTGGAGAG CTGTTGGTGA GATGTAAGGG 
1151 CAAGGTTTAG TTGCTAGATA TGTAGATGGC AAGATGGTGC TGCCAACAGC
1201 CCCCAGAGCT CTAACCCACT GAGAAACCCA GGAATGAATG ATGGGAGATG 

12 51 GCTTTGGTGC CAGCTGCTAG TGACATGGCT GGAAAGCTGC ACTGGCTTCG 
1301 AGGCCAGACA ATTCCTCAAG GAAACATCTG GCCAGGGTGC AAGGGCCAGT 

13 51 TTCCTTCCTT GGAGTTCCTT TCACAGCTAA GAACATCATC CCCCAACCAC 

14 01 TGGTTTTGTT AAAAAGTTTT CAGTATGACT TGAGCATGGT CAAGAAGCAT 
14 51 AGAGAGGGGG AAATAAGGGT GGAAGGAGCT GGAGAAAGCT TACAATAGGA 
1501 CTGGGTAAAG GGAAGGAGAA GAAACCATTC CCGCATTCCC ATAGGAGCCA 
1551 GTACCAGGAA GGGCAGGTGT ACACACAGAT CTCATCTAAG GCCATGTTTG 
1601 GTTTAGGGAT TACTCTTCTC CCGAATCTGA GCAGCAGCAA TACGTAAAAT 
1651' ACCCACACCC ATGGCTTCCA TATTCCAGAA CTTATCACAA ACCGTGTAGA 
17 01 GTTTACTGAG ATACCTTCGT CAGAGGATGA GTCAGAGGCC TCCTGCCTAA 
17 51 GGGCCCTACT GAGCAGGCAG CTAAAGGCTT CCGGGCCTCT GCAGCTCCAC 
1801 AGATACAGGA GAGGGAAGCA GATAAGCCGT GGACTCCACC TGAGCACACC 
1851 TAGCTTGAGC AAAGCTGGTC AGGTACAAAT AGCAGAGGGC TGAATGTCTG 
1901 TGAGCACGCC GCCTGATCCT CTGCTCCACC ACACTCCTGC CGCCATGAAG 
1951 CTCACAGTAA GTCAGATCTT CTTTTCAATG CAGCACCATA CAACATTAAT 
2001 AGTCAGGGGT GAGGGGGTCT GJACTCTTACG GCACTGTTAC CATAGTGGAA 
2051 ATATTCTCCT TTCTTTTCAT GGAATCATGG TGTTTACAAG CATGTCCATA 
2101 GAGAAGAAGA ATTGCCCCGG AAGAGCCTGT CACAGGCTGA ATACTGTAGA 
2151 ATTGTCTTTC ACACCATCTG TTCCAAGGTT CTACTTAAGA CGAGCAGTCT 
2 201 CTGGGCTCCA GAAAGAGTCT TTCTTAGCCT TGATCTCTTT CTTATTTCTG 
2251 ATTTCTCCTT TCTTATCCAT GATTTCCACT TTTACCAGTT CTGGGCATGT 
2301 TCCGGTCAGA CTGGAAGATC ACTGTTGTCA AAACTAGTCT TCAACACTCT

2 351 TGGCTGTTAA CATGAAAACA ACGGTCCTTG GGCCCTGTGC AAGCATTTCT 

24 01 TGGAGAAAGT CTCTGGGGAT GAAGCTATCT CAGTTTCCCC ACTGAAGTCC 

24 51 TAGGATACAG AGGCTCAAAC AGAGTGCACA TATTCAATTT CAGCATACTC 

2 501 TATTGGCGCT GCTTTATGAA TCATATGAAT TTATGGAATT GGAAATGTAA 

2 551 ACTATGACCA AGAAGGGTCC ACCTCAGAAC AGGTTGGGTG GGGAACTCCA 

2 601 AGCACAGGCC AGAGGGCTGC GTTTCTCTTC TAGTTCTGTC TAGAGGAGTG 

2 651 GTTCTCGACC TTCCTAATGC TGTGACCCTT TAATACAGTT CCTCACGTTG 

27 01 TCGTGACTCC CAGCCATAAA ATTACTTTCA TTGCTACTGC ATAACTGTAA 

27 51 TTTTGCTACC ATTATGAGTT GTAATGTAAA TATCTGATAT GCAAGATACC 

28 01 AGATAACCTA AGAAACGGTT GTTTGACCTT TAAAGGGGTC ACAACCCACA 
2 851 GGTGGAGAAC TACTGGTCTA GGGTCCTTTA CAGTCCTTTA GCTGCCTCAT 
2 901 TTACAGGAGA TAACATCATG CTCAAAAACT CCCTCCACAT TTGGCTTTTT 
2 951 GGGTTGTTTT GTTTTGTTTT TCAAGACAGG GTTTCTCTGT GTAGCCCTGG 
3001 CTGTCCTGGA ACTCACCTTT GTAGACCAGG CTGGCCTCGA ACTCAGAAAT 
3051 CCGCCTGCTT CTGCCTCCTG AGCGCTGGGA TTAAAGGCGT GCGCCACCAT 
3101 GTCTGGCTCA CATCTGGCTT TTTAAGAGAC CGATTTTAAC TTCTTGCATT
3151 GAAAATAAAT ATAGTAGAAA TGCTTAACCT ACTAAGACAA TAAAAACAGG
3201 ATTCCTTCTG CTAGGAAGAA CACGTTCCAG ACTAAGGAAA AAAACCTTTT 
3251 CAGGGCTTTC ATTACACTGT GCCATGCACT AATTTTATGT TTTCTTCATC 
3301 AGTTTTCAGT GTCTGAAATT CAGTGTCAAA ATTCTAAGAC TACATATGAA
3351 TATCATTACA GTAACTCAGC AATTCTATGT TACCAGTAAG TTTTTCTGTA
3401 GTTTAAAAAA AAGGTGGAAG AAGAAAGCAC AGATAGTTTA GCACATGGGT
3451 AAAATCAGTA ACTATTTCTG ATGAGCTTGG TGAAGATGCT GTAAACCATG
3501 CGACCACCAG TCCTGTTCTC TGTGCTTTCA GATGTTCGTC GTGGGTCTGC
3551 TTGGCCTCCT TGCAGCTCCT GGTTTTGCTT ACGTAAGTCT CATTTTTCTG
3601 AAGTTCATTG TCAAAACTGC ATTTACAGTG AAATGTGATC TTAAGTCACC
3651 CTCTGCTTCT TATGAACATT AGACGGTCAA CATCAATGGT AATGATGGCA
3701 ATGTAGACGG AAGTGGACAG CATTCGGTGA GCATCAATGG TGTGCACAAC
3751 GTGGCCAATA TCGACAACAA TAACGGCTGG GACTCCTGGA ATAGCCTCTG 
3801 GGACTATGAA AACGTATGTA ATGGACACAC AGGGTAAAGA TATGGTGTAG
3851 CCACCACCCA TTAAAATTTC TGAGGTGAAT TCTAGCTGTT CATGAACATT
3901 AAAAGCTACC AGTAAAAGTG CCCATTCCAC TCAAAACAAT TTTACTTTTT
3951 TGCATATAAT TATTGCTAAT AAGTATTACA CAATAGGTCG AAATTCAAAG
4001 GGATCAATAG TAAGGATAAA AACTATGTAC AAAGACAAAC ACAGCATCCT
4051 TTGGTCTTCC CTGCAGAGAG TCTCCATGAT GTTAAAGGTC CAATGTTTTA
4101 TGGAGGCTGA ATGAAATACG AATGCCTCTG TGATGGAAAA GGCCCAACAT 
4151 CTTATGGAGA ATGAGTGAAG TATGAATGCT ATTAGTTGTA AGAGAAGGCG 
4201 ATGCAAAGCA ACACTTGGCA CCACCTGCCA ATTACTACTT TCCTATTTAA
4251 ATGTAGTTTA AAAAGCAAAG CCTGTCTTCC CTGCCTCCTG GAAACACTGC
4301 GGATGGAGGT AGACCAAGGT ATGACAGCCT TTAAAAQTTT GTCAGCAAAA
4351 CACTCCCCCA TACACACATA CACACACCCT CCTACTACAC TGGAACTGAA
4401 GCAAAGGCAG TGGGTTAGAT ATATCCACCC TCTAAGAGTT TGCAGGTCAT
4451 CTATATATGA TAGCCAGAGA CACAACTGCA GGACAGCCAG ACTCTGAGCA
4501 CTCTCCCCAG CTCCTTGTAG CTCTGTTTCA GTGGTGACTT GTGACAAGAA
4551 TCCTGGGGAA CCTGTGCCTC ACTGTTCTCT GTCTTCTTTA ATAGAGTTTC
4601 GCTGCCACGA GACTCTTCTC CAAGAAGTCA TGCATTGTGC ACAGAATGAA
4651 CAAGGATGCC ATGCCCTCCC TTCAGGACCT CGATACAATG GTCAAGGAAC
4701 AGAAGGTAAA GTCCTGCCTT CTTCTTTGGA GTGACAGGAA GTCTTACAGT
4751 CTCCAGTACA CAGTGAAGTC ACCCCCATTC CCTCTTTGGT GGAGCATGAC
4801 AGCATGTTTG TCATGATAAA TGCCACAAAC ATGTAAAACT GTTCAGTGTC
4851 TGCCTGAATG GAGGGTGGCT TCCACTGTGT CAGATGCCGT GGCCCACATC
4901 TGCCTCTGCA GGGTCCAGTA AAGCACTGGC TATCTTGAGT GTCAGAGACC
4951 CAAAGGTCTG TACACTTCAG TACAAGCCCT CCATATTTCA AGGGCACACT
5001 CCTACAGTCG TTGGGGTTAT CAGAACTAGC AAACATAGAG ACTGGATTTT 
5051 CAGATGAAAA GAAATCCTTT TTAAAGTCTA AGTATGCCTT ATACAATGTT 
5101 TGAGATATTC TCAATACTAA AAAAAAAAAA ATTGTTGCTT GCTTGAAAAT 
5151 CAAATGTAAC CAAGTGTCCT ATATCCAGTG TCAATCATGG CTGTAGTAGA 
5201 TGGGAAGAGG GAGCCCGTGG TTTTCACAGT CAGACGCCTG AGTTATTCTT
5251 CTAAGTGATA AATTGGTTCC TATAACAAGC AAGCCAGTGA ATATAAATAA
5301 GCTCTATCTC AGAAGTTATC CTGTAGTGCT ACCCTAGAAT CTAAGAGAGC
5351 AAAAGTGCTT CAAATTTCAG AATAAGTTTT GCTTTGGACT TCTGTTTTTC
5401 TAAACAACTA TAACTTCAAA CCATCTAAGC CTCGTGGGAC ACTTAGAAAT
5451 ACCAAGCCAT TCAAAGCTAG AATTGTTTCT TCACCTTACT TGAAAACAAA
5501 ATGACAACCA AAAATTGTCC CCACTGCCCT TGTACATCTT CAGATCAGTA
5551 AAGTCCTGGG CTCAGGGATC ATTCACTTTC TTTCTTTCCT TTCACACTCA
5601 ACTTCAGGGT AAAGGGCCTG GAGGAGCTCC TCCCAAGGAC TTGATGTACT
5651 CCGTCAACCC TACCAGAGTG GAGGACCTGA ATACATTCGG ACCAAAGATT
5701 GCTGGCATGT GCAGGGGCAT CCCTACCTAT GTGGCCGAGG AGATTCCAGG
5751 TGTGTACCCT GAGATGCTGT ATATCCCAAT GCAGTACTGA GAGAGCCATC
5801 AGACACTCTA AAGTGTGACC ACAGACGGAC CAATCATGTG GATTATCAGA
5851 GCAAACACTT GCTTGCTCCT TGTCAGACAG TTGTCCATGC TTCAAAAGTT
5901 CATTAAAAAA AATAGTTCAC AGGCTCCTCA CAGAAACCTT AGTAGAATCC
5951 ACAGCTTCTG CTCTTAGTCT TACTTTTTAG AAACTGAGAC CCAGAGAAAG
6001 GTCACAAAAC TTTTGTCTGG CTCAGGTTCT ATGTCTTTAA CTTTATAGAA
6051 TACCGTCTTT CTGGGTGGGT GGGCTCTAGA GTAAACTTCA AGTGAGTTCA
6101 AGGAAAGCAT GAGAAGTAGG GAAGACCAAA TGAAAGGAGA ATGCCAATGA
6151 AATCTATCGA TTCTATAGCG CCAATGCTTA ACTCCTAGGC GTTCAAAGAA
6201 TAGTATCCAC AAGGTGTCAG CCTAAGATCC TAATCTAACA GCAAGTTTTC
6251 AGATCTCTGA AGTGAAAAGA GAAAGCAAGA GAGGAACAGA GACAGAAACA
6301 GTAAGAGACA GAGAGGCAGA GACAAAGAGA CAGGGAGAAT AGAGAGGGAT
6351 TAAAATTAAT ATATAGTTTA GAAATTACGA CTCCTCACAG TCCCTGCAGA
6401 GTCCTAGGAT AGGCACTGAT TTGGACTTCT TTTCTTCTCA CTAGGACCAA
6451 ACCAGCCTTT GTACTCAAAG AAGTGCTACA CAGCTGACAT ACTCTGGATT
6501 CTGCGGATGT CCTTCTGTGG AACATCAGTG GAGACATACT AGAAGTCACA 
6551 GGAAAACAAC CCGTGGGCTC TGACCATCGC AATGCTTGAT TATGAGAGTG
6601 TTCTCTGGGG GTTGTGATTA GCTTCTTTAA GGCTCAATAA ACCCACGTGG
6651 CAGCACATCC AGTTTGTAAT GACATGCCTC ATGACTTCTA TGGGAGTCCA
6701 ATGTGGCACC TGCCAGCCTG TATTCAGGAC CTCTCCGCTA TAAAGCATCC
6751 CTCCAGAQTT TTCAAATACT ACAAAGCACA GCCTGGGTTT GGGCTCAGAT
6801 AGGCCACTGC TGCCTGACTA CATTACAGAC AAACAAGTTT TAAAAGAAAG
6851 AAAAAAGAGC TCAGAGTGGC TGGAATCAGC AAGGGTGTTT TTCCTGCAAG
6901 GAGCCAGAAG TATCAATAAT CACCCAAGGA GGAGACACTG GGAATGAGAG
6951 ACTAGAACAC ACGCCTGCAG ATACGGAGAA CCTCAGCATT GCCGCTCTCT
7001 CCCATAACTG CACACCCCCT TCTGTAAACT CTGCTTCTTT CTTTCACCTG
7051 AAGATGGCCC TTGCTTTTTT TTATTATAGG ACANGATAAC TAGACCAGAA
7101 AGTCAACCTG ACTCTCTACA TTTATATGTC TTCCCAGNTC AAGAAATATT
7151 ATTTACTGGT GAATGGGACT TCTATATTCC CTTGGTTCAA TAAGTCTACA
7201 GGATCCATTC ATTGACAGGG CAAGAGTGAG ATCACATGAT ACCCAAGCAC
7251 ATGGGTCTTT CCTTGAAGGA GAAGGATCCA

1 ATGTTCGTCGTGGGTCTGCTTGGCCTCCTTGCAGCTCCTGGTTTTGCTTACACGGTCAAC
61 ATCAATGGTAATGATGGCAATGTAGACGGAAGTGGACAGCATTCGGTGAGCATCAATGGT
121 GTGCAGAACGTGGCCAATATCGACAACAATAACGGCTGGGACTCCTGGAATAGCCTCTGG
181 GACTATGAAAACAGTTTCGCTGCCACGAGACTCTTCTCCAAGAAGTCATGCATTGTGCAC
241 AGAATGAACAAGGATGCCATGCCCTCCCTTCAGGACCTCGATACAATGGTCAAGGAACAG
301 AAGGGTAAAGGGCCTGGAGGAGCTCCTCCCAAGGACTTGATGTACTCCGTCAACCCTACC
361 AGAGTGGAGGACCTGAATACATTCGGACCAAAGATTGCTGGCATGTGCAGGGGCATCCCT
441 ACCTATGTGGCCGAGGAGATTCCAGGACCAAACCAGCCTTTGTACTCAAAGAAGTGCTAC
501 ACAGCTGACATACTCTGGATTCTGCGGATGTCCTTTTGTGGAACATCAGTGGAGACATAC
561 TAG




FIG. 5 




1 MKLTMFVVGL LGLLAAPGFA YTVNINGNDG NVDGSGQQSV SINGVHNVAN
51 IDNNNGWDSW NSLWDYENSF AATRLFSKKS CIVHRMNKDA MPSLQDLDTM
101 VKEQKGKGPG GAPPKDLMYS VNPTRVEDLN TFGPKIAGMC RGIPTYVAEE
151 IPGPNQPLYS KKCYTADILW ILRMSFCGTS VETY (SEQ ID NO: [illegible])

1 atgcctgact tctcacttca ttgcattggt gaagccaaga tgaagttcac

51 aattgccttt gctggacttc ttggtgtctt cctgactcct gcccttgctg

101 actatagtat cagtgtcaac gacgacggca acagtggtgg aagtgggcag 

151 cagtcagtga gtgtcaacaa tgaacacaac gtggccaacg ttgacaataa 

201 caatggatgg aactcctgga atgccctctg ggactataga actggctttg 

251 ctgtaaccag actcttcgag aagaagtcat gcattgtgca caaaatgaag 

301 aaggaagcca tgccctccct tcaagccctt gatgcgctgg tcaaggaaaa 

351 gaagcttcag ggtaagggcc cagggggacc acctcccaag agcctgaggt 

401 actcagtcaa ccccaacaga gtcgacaacc tggacaagtt tggaaaatcc 

451 atcgttgcca tgtgcaaggg gattccaaca tacatggctg aagagattca 

501 aggagcaaac ctgatttcgt actcagaaaa gtgcatcagt gccaatatac 

551 tctggattct taacatttcc ttctgtggag gaatagcgga gaactaa (SEQ ID NO: [illegible])

1 MKFTIAFAGL LGVFLTPALA DYSISVNDDG NSGGSGQQSV SVNNEHNVAN

51 VDNNNGWNSW NALWDYRTGF AVTRLFEKKS CIVHKMKKEA MPSLQALDAL 

101 VKEKKLQGKG PGGPPPKSLR YSVNPNRVDN LDKFGKSIVA MCKGIPTYMA 

151 EEIQGANLIS YSEKCISANI LWILNISFCG GIAEN (SEQ ID NO: [illegible])

Human 1 MKFTIVFAGLLGVFLAPALANYNIDVNDDNNNAGSGQQSVSVNNEHNVAN 50

Pig 1 MKFTIAFAGLLGVFLTPALADYSISVNDDGNSGGSGQQSVSVNNEHNVAN 50

51 VDNNNGWDSWNSIWDYGNGFAATRLFQKKTCIVHKMKKEVMPSIQSLDAL 100 

51 VDNNNGWNSWNALWSYRTGFAVTRLFRKKSCIVHKMKKEAMPSLQALDAL 100 

101 VKEKKLQGKGPGGPPPKGLMYSVNPNKVDDLSKFGKNIANMCRGIPTYMA 150

101 VKEKKLQGKGPGGPPPKSLRYSVNPNRVDNLDKFGKSIVAMCKGIPTYMA 150 

151 EEMQEASLFFYSGTCYTTSVLWIVDISFCGDTVEN 185 (SEQ ID NO: [illegible])

151 EEIQGANLISYSEKCISANILWILNISFCGGIAEN 185 (SEQ ID NO: [illegible])

1

Human MKFTIVF.AG LLGVFLAPAL

Pig MKFTIAF.AG LLGVFLTPAL

Mouse MKLTM.FVVG LLGLLAAPGF

51 

Human ANVDNNNGWD SWNSIWDYGN

Pig ANVDNNNGWN SWNALWDYRT 

Mouse ANIDNNNGWD SWNSLWDYEN 

101 

Human ALVKEKKLQG KGPGGPPPKG 

Pig ALVKEKKLQG KGPGGPPPKS 

Mouse TMVKEQK..G KGPGGAPPKD

151 

Human MAEEMQEASL FFYSGTCYTT 

Pig MAEEIQGANL ISYSEKCISA 

Mouse VAEEIPGPNQ PLYSKKCYTA 




50

ANYNIDVN.D DNNNAGSGQQ SVSVNNEHNV
ADYSISVN.D DGNSGGSGQQ SVSVNNEHNV
A.YTVNINGN DGNVDGSGQQ SVSINGVHNV

100 

GFAATRLFQK KTCIVHKMNK EVMPSIQSLD 
GFAVTRLFEK KSCIVHKMKK EAMPSLQALD 
SFAATRLFSK KSCIVHRMNK DAMPSLQDLD

150 

LMYSVNPNKV DDLSKFGKNI ANMCRGIPTY 
LRYSVNPNRV DNLDKFGKSI VAMCKGIPTY